A finite state Markov process is aggregated into several groups. Rather than observing the underlying Markov process, 
one is only able to observe the aggregated process. What can be learned about the underlying process from the 
aggregated one? Such questions arise in the study of gating mechanisms in ion channels in muscle and nerve cell 
membranes. We discuss some recent results and their implications. 
1. Introduction 

We are concerned with the following mathematical prob- 
lem: a finite state Markov process in continuous time is 
aggregated, by which we mean that its states are grouped 
into a smaller number of aggregates. The Markov process is 
assumed to be in equilibrium. One is not able to observe the 
process itself, but only what aggregate the process is in as 
time goes along. From the aggregated process we wish to 
draw inferences about the underlying Markov process. How 
much can we learn? This is a rather general question and 
more sharply defined questions can be asked. For example, 
to what extent is the graph structure of the Markov process 
identifiable? Are some aspects of it identifiable and some 
not? (By the graph we mean a diagram showing the states 
and the interconnections between them, but not the numeri- 
cal values of the transition rates.) Can some graph structures 
be ruled out as incompatible with the aggregated process? If 
the graph structure is known or hypothesized a priori, what 
functions of the various rate constants can be identified? 

These are questions of identifiability. There are also prob- 
lems related to the efficient statistical use of observation of 
the aggregated process over a finite time interval. 

Problems such as these arise in modeling and data analy- 
sis for biophysical studies of gating mechanisms in ion chan- 
nels in muscle and nerve cell membranes. The next section 
of this paper briefly describes the biophysical context. In the 
third section we summarize the mathematical results that we 
have been able to obtain and discuss their implications and 
possible applications. In the fourth section we make some 
concluding remarks and mention several open questions. 

2. Biophysical Background 

Ion channels are transmembrane proteins with the ability 
to open a pore through which ions can flow with high con- 
ductance; in the absence of such pores, the lipid bilayer 
membranes of cells are virtually impermeable to most 
charged particles. Most channels are either voltage "gated" 
(controlled), such as the sodium channel in nerve axons, 
which is the fundamental non-linear circuit element involved 
in the propagation of nerve impulses, or chemically 
gated, such as the post-synaptic acetylcholine receptor. We 
have already reviewed the biophysical context [1]¹ and a 
broad review of ion channels is now available [2]. We shall 
therefore confine ourselves here to an illustrative example. 

The present work started with the desire to extract infor- 
mation from experiments on the chemically gated acetyl- 
choline receptor. Chemically extracted protein is incorpo- 
rated in an artificial lipid bilayer membrane separating two 
compartments containing electrolyte solution. The voltage 
difference between the two sides of the membrane is fixed, 
and the current through the membrane is measured. In the 
presence of an agonist (chemical stimulant, such as acetyl- 
choline), the current is found to fluctuate randomly between 
two levels, reflecting the open or closed state of the channel. 
(Great pains are taken to arrange to have only one active 
channel, as shown by the absence of time intervals with 
current a multiple of the minimum quantum.) It is known 
that the agonist must bind to the channel to permit opening, 
and the time scale for channel opening and closing is much 
shorter than the time scale for agonist binding, as shown by 
chemical kinetics studies, so the simplest kinetics would be 
a Markov process with three states: $C_1$ has no bound agonist 
and the channel is closed, $C_2$ has bound agonist and the 
channel is still closed, and $O$ has bound agonist and the 
channel is open, with transitions $C_1 \leftrightarrow C_2 \leftrightarrow O$. The transition 
$C_2 \leftrightarrow O$ is visible in the experiment as a jump in electric 
current through the membrane. The transition $C_1 \leftrightarrow C_2$, 
which involves the binding or dissociation of agonist, is 
invisible because it does not involve a change in channel 
conductance. Under these circumstances, we have the simplest 
example of an aggregated Markov process, with aggregates 
$\{C_1, C_2\}$ and $\{O\}$. We would like to see if the scheme 
$C_1 \leftrightarrow C_2 \leftrightarrow O$ is consistent with the data, and, if it is, we 
would like to estimate the four transition rates in the scheme 
from the data. In this particular case, it is easy to see, using 
the results described in the next section, that the transition 
rates can be estimated from the data, and in fact it is sufficient 
to use the one dimensional densities for this purpose. 
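
To make the notion of an aggregated record concrete, the three-state scheme can be simulated and reduced to the alternating sequence of closed and open dwell times that an experiment would show. The following Python sketch is only an illustration: the rate constants are hypothetical values chosen for the example, and the names `RATES`, `AGGREGATE` and `simulate_dwell_times` are ours, not part of any cited analysis.

```python
import numpy as np

# Hypothetical rate constants (per unit time); illustrative values only,
# not estimates from any experiment.
RATES = {
    ("C1", "C2"): 1.0,   # agonist binding
    ("C2", "C1"): 2.0,   # agonist dissociation
    ("C2", "O"): 5.0,    # channel opening
    ("O", "C2"): 10.0,   # channel closing
}
AGGREGATE = {"C1": "closed", "C2": "closed", "O": "open"}


def simulate_dwell_times(n_transitions, seed=0, start="C1"):
    """Simulate the underlying chain and return the aggregated record as a
    list of (aggregate, dwell_time) pairs. Transitions that stay inside an
    aggregate (C1 <-> C2) are invisible and are merged into one sojourn."""
    rng = np.random.default_rng(seed)
    state = start
    record = [[AGGREGATE[state], 0.0]]
    for _ in range(n_transitions):
        targets = [(dst, r) for (src, dst), r in RATES.items() if src == state]
        total = sum(r for _, r in targets)
        hold = rng.exponential(1.0 / total)          # holding time in `state`
        probs = [r / total for _, r in targets]
        state = targets[rng.choice(len(targets), p=probs)][0]
        record[-1][1] += hold
        if AGGREGATE[state] != record[-1][0]:        # observable jump
            record.append([AGGREGATE[state], 0.0])
    # The last sojourn is incomplete (or empty), so it is dropped.
    return [(a, t) for a, t in record[:-1]]


record = simulate_dwell_times(20000, seed=1)
open_times = [t for a, t in record if a == "open"]
closed_times = [t for a, t in record if a == "closed"]
```

With these rates the open dwell times are exponential with mean 1/10, whereas the closed dwell times mix sojourns in $C_1$ and $C_2$ and are therefore not exponential.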

The model we have just described is radically oversimpli- 
fied and inconsistent with experiment. One feature which is 
accessible via these experiments but not via agonist binding 
studies is that the one dimensional density for the channel 
open "state" (actually, aggregate) is the sum of at least two 
exponentials [3]. According to the next section, this demonstrates 
the existence of at least two open states, $O_1$ and $O_2$. 
The aggregates are now $\{C_1, C_2\}$ and $\{O_1, O_2\}$. We would 
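like to accept or reject schemes like $C_1 \leftrightarrow C_2 \leftrightarrow O_1 \leftrightarrow O_2$, 
$\{C_1 \leftrightarrow C_2 \leftrightarrow O_1,\ C_2 \leftrightarrow O_2\}$, and $C_1 \leftrightarrow C_2 \leftrightarrow O_1 \leftrightarrow O_2 \leftrightarrow C_2$. The 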
theorems of the next section imply immediately that the one 
dimensional densities contain all the information available if 
any of these schemes is correct, that the third scheme, which 
contains a cycle, is not identifiable, and that any of these 
schemes can be excluded if correlation is observed between 
two consecutive durations of channel opening, which it is 
[4,5]. 



3. Results 

We first introduce some notation. P(t) will denote the 
transition matrix of the Markov process. We will assume 
throughout that the process is in equilibrium. As is standard, 
we let 

$$Q = \lim_{t \to 0} \frac{P(t) - I}{t},$$

where $I$ is the identity matrix. 



The aggregates will be indexed by lower case Greek letters; 
$n_\alpha$ is the number of states in aggregate $\alpha$. We order the states 
so that states in the same aggregate are contiguous, and we 
partition the matrix $Q$ into sub-matrices $Q_{\alpha\beta}$. We will assume 
throughout that the submatrices $Q_{\alpha\alpha}$ are diagonalizable, 
which holds if the law of detailed balance is valid for 
the system. 

Supposing that the process enters aggregate $\alpha$ at time 
$t = 0$, we denote the probability density of the length of time 
$T$ spent in that aggregate before leaving by $f_\alpha(t)$. It is shown 
in [6] and [1] that 

$$f_\alpha(t) = \pi_\alpha \exp(Q_{\alpha\alpha} t)\, q_\alpha,$$

where $q_\alpha = \sum_{\beta \neq \alpha} Q_{\alpha\beta} u_\beta$, $u_\beta$ is a column vector of $n_\beta$ ones, and 
$\pi_\alpha$ is a row vector giving the probabilities that aggregate $\alpha$ 
is entered via each of its states. 
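
As a numerical illustration, the dwell-time density can be evaluated for the three-state scheme of Section 2. The sketch below assumes the matrix form $f_\alpha(t) = \pi_\alpha \exp(Q_{\alpha\alpha} t) q_\alpha$, with the entry probabilities $\pi_\alpha$ taken proportional to the equilibrium probability flux into each state of the aggregate. The rate constants are hypothetical, and the helper names (`matrix_exp`, `equilibrium`, `dwell_density`) are ours.

```python
import numpy as np

# States ordered (C1, C2, O); hypothetical rate constants for illustration.
k12, k21, k2o, ko2 = 1.0, 2.0, 5.0, 10.0
Q = np.array([
    [-k12,          k12,  0.0],
    [ k21, -(k21 + k2o),  k2o],
    [ 0.0,          ko2, -ko2],
])
closed, open_ = [0, 1], [2]


def matrix_exp(A, t):
    """exp(A t) by eigendecomposition; assumes A is diagonalizable."""
    w, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)).real


def equilibrium(Q):
    """Equilibrium distribution p: solves p Q = 0 with sum(p) = 1."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p


def dwell_density(Q, inside, outside, t):
    """Density at t of the dwell time in the aggregate `inside`."""
    p = equilibrium(Q)
    Qaa = Q[np.ix_(inside, inside)]
    Qab = Q[np.ix_(inside, outside)]
    Qba = Q[np.ix_(outside, inside)]
    q = Qab @ np.ones(len(outside))   # total exit rate from each inside state
    flux = p[outside] @ Qba           # equilibrium flux into each inside state
    pi = flux / flux.sum()            # entry probabilities
    return float(pi @ matrix_exp(Qaa, t) @ q)
```

For the single open state the function reduces to the exponential density `ko2 * exp(-ko2 * t)`; for the closed aggregate it returns a mixture of two exponentials.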

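Under the assumption that $Q_{\alpha\alpha}$ is diagonalizable this one 
dimensional density can be re-expressed as: 

$$f_\alpha(t) = \sum_{i=1}^{n_\alpha} a_\alpha^i \exp(-\lambda_\alpha^i t),$$

where the $\lambda_\alpha^i$ are the eigenvalues of $-Q_{\alpha\alpha}$. 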

¹ Numbers in brackets indicate literature references. 



An implication of this result is that a lower bound on $n_\alpha$ 
can be obtained by counting the number of exponential 
components. This result has been widely used in channel 
gating studies by fitting experimental results to sums of 
exponentials and judging how many exponentials to include 
by a chi-squared statistic. 
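
The exponential components can be read off from an eigendecomposition of the within-aggregate block. The Python sketch below does this for the closed aggregate of the three-state scheme of Section 2, assuming hypothetical rate constants and writing the density as a sum of terms `a[i] * exp(-lam[i] * t)`. The function name `exponential_components` and the numerical values are ours, chosen for illustration.

```python
import numpy as np


def exponential_components(Qaa, pi, q):
    """Return (lam, a) such that the dwell-time density is
    sum_i a[i] * exp(-lam[i] * t). Assumes Qaa is diagonalizable."""
    w, V = np.linalg.eig(Qaa)
    left = pi @ V                   # entry vector in the eigenbasis
    right = np.linalg.solve(V, q)   # exit vector in the eigenbasis
    a = (left * right).real
    lam = (-w).real
    order = np.argsort(lam)
    return lam[order], a[order]


# Closed aggregate {C1, C2} with hypothetical rates:
# C1 -> C2: 1, C2 -> C1: 2, C2 -> O: 5.
Q_cc = np.array([[-1.0, 1.0],
                 [2.0, -7.0]])
pi_c = np.array([0.0, 1.0])   # the aggregate is always entered via C2
q_c = np.array([0.0, 5.0])    # only C2 can exit to the open state

lam, a = exponential_components(Q_cc, pi_c, q_c)
```

Both coefficients are nonzero here, so two exponential components would be visible in the closed-time density, which is consistent with the two closed states of the scheme.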

Two dimensional densities can also be used to obtain 
information about the Markov process. It can be shown that 
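the density of spending a length of time $s$ in aggregate $\alpha$ and 
then a length of time $t$ in aggregate $\beta$ is 

$$f_{\alpha\beta}(s, t) = \sum_{i=1}^{n_\alpha} \sum_{j=1}^{n_\beta} a_{\alpha\beta}^{ij} \exp(-\lambda_\alpha^i s) \exp(-\lambda_\beta^j t).$$

(Here we use $i$ and $j$ as indices, not powers.) It is noteworthy 
that the same exponential parameters occur in both the one 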
and two dimensional densities. This might be used to judge 
the plausibility of an underlying Markov process model. 

The matrix of linear coefficients with $ij$ entry $a_{\alpha\beta}^{ij}$ yields 
information about how the aggregates are interconnected: 

Theorem A. [7] Let $A_{\alpha\beta} = [a_{\alpha\beta}^{ij}]$ be the matrix of coefficients 
of the two dimensional density above. Then the rank of $A_{\alpha\beta}$ is 
less than or equal to the rank of $Q_{\alpha\beta}$, $\rho_{\alpha\beta}$ say, and $A_{\alpha\beta}$ 
depends on at most $\rho_{\alpha\beta}(n_\alpha + n_\beta - \rho_{\alpha\beta})$ parameters. The rank 
of $Q_{\alpha\beta}$ is less than or equal to the smaller of: the number of 
states in $\alpha$ which are linked directly to states in $\beta$, and the 
number of states in $\beta$ which are linked directly to states in 
$\alpha$. 

It may be possible to empirically obtain a lower bound on 
the complexity of interconnection by fitting two dimensional 
densities and using chi-square tests. We hope to implement 
and test such a procedure in the future. 

Higher dimensional densities can be considered also. The 
following theorem shows that under certain conditions no 
more information about the process can be obtained from 
these densities, however. 

Theorem B. [7] If, for each aggregate $\alpha$, the eigenvalues 
$\lambda_\alpha^i$ are distinct and $a_\alpha^i \neq 0$ for all $i$, the higher dimensional 
densities $f(t_1, \ldots, t_r)$, $r > 1$, are completely determined 
by the two-dimensional densities. 

Thus, in principle, all the available information about the 
underlying process can be extracted from the two dimen- 
sional densities if the hypotheses of the theorem are satis- 
fied. By counting the number of independent parameters 
involved in the two dimensional densities, we have the 
following theorem: 

Theorem C. [7] Under the assumptions of Theorem B, the 
finite dimensional distributions depend on at most 
$\sum_{\alpha \neq \beta} \rho_{\alpha\beta}(n_\alpha + n_\beta - \rho_{\alpha\beta})$ parameters. 

Thus for example, if there are two aggregates and $Q_{\alpha\beta}$ has rank $\rho$, there is 
information on at most $2\rho(n_\alpha + n_\beta - \rho)$ parameters. If a 
model depends on more than this many parameters, its 
parameters are not uniquely identifiable. 
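
The parameter count can be checked mechanically for candidate schemes. The following Python sketch builds a generator matrix for each of three four-state schemes with two closed states ($C_1$, $C_2$) and two open states ($O_1$, $O_2$), computes the rank $\rho$ of the closed-to-open block, and compares the bound $2\rho(n_\alpha + n_\beta - \rho)$ with the number of rate constants. The scheme labels, the edge lists and the rate values are our assumptions for illustration; only the positions of the nonzero rates matter for the count.

```python
import numpy as np

CLOSED, OPEN = [0, 1], [2, 3]   # state order: C1, C2, O1, O2

# Undirected edges of each hypothetical scheme (every edge carries two rates).
SCHEMES = {
    "linear": [(0, 1), (1, 2), (2, 3)],           # C1-C2-O1-O2
    "branched": [(0, 1), (1, 2), (1, 3)],         # C1-C2-O1 and C2-O2
    "cyclic": [(0, 1), (1, 2), (2, 3), (3, 1)],   # C1-C2-O1-O2-C2
}


def generator(n, edges):
    """Generator matrix with an arbitrary positive rate on every directed
    edge. The values are placeholders, not physical rate constants."""
    Q = np.zeros((n, n))
    for k, (i, j) in enumerate(edges):
        Q[i, j] = 1.0 + k
        Q[j, i] = 2.0 + k
    Q -= np.diag(Q.sum(axis=1))   # rows of a generator sum to zero
    return Q


def bound_and_rate_count(Q, a, b):
    """Return (rho, 2*rho*(n_a + n_b - rho), number of rate constants)."""
    rho = int(np.linalg.matrix_rank(Q[np.ix_(a, b)]))
    bound = 2 * rho * (len(a) + len(b) - rho)
    n_rates = int(np.count_nonzero(Q) - np.count_nonzero(np.diag(Q)))
    return rho, bound, n_rates


results = {name: bound_and_rate_count(generator(4, edges), CLOSED, OPEN)
           for name, edges in SCHEMES.items()}
```

In all three schemes only $C_2$ is linked to the open aggregate, so $\rho = 1$ and the bound is 6. The linear and branched schemes have 6 rate constants, while the cyclic scheme has 8 and therefore cannot be uniquely identified from the aggregated process.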

Theorem A above suggests one way to study the complexity 
of interconnection between two aggregates. Labarca et al. 
[5] have also used certain correlation functions for this 
purpose. For a particular aggregate $\alpha$, say, the sequence of 
dwell times in that aggregate, $T_1, T_2, \ldots$, is a stationary process, 
and it can be shown that the covariance function of that 
process is of the form given in the following theorem: 

Theorem D. [1] The covariance function is of the form 

$$\Gamma(k) = \sum_{i=1}^{M-1} c_i \kappa_i^{|k|},$$

where the $c_i$ are constants, $0 < \kappa_i < 1$ and $k \neq 0$, and where $M$ is the rank of the 
matrix $Q_{\alpha\cdot}$ which is composed of the off-diagonal blocks 
corresponding to aggregate $\alpha$ in the matrix $Q$. If $M = 1$, 
$\Gamma(k) = 0$ for $k \neq 0$. 



The rank of $Q_{\alpha\cdot}$ is less than or equal to the smaller of: the 
number of states in $\alpha$ which are linked directly to states 
in any other aggregate, and the number of states in other 
aggregates which are linked directly to states in $\alpha$. 

Correlation functions can thus be used to obtain information 
similar to that in the two dimensional densities as in 
Theorem A. In the case that there are more than two aggregates, 
the two dimensional densities contain finer information, 
however, since a lower bound on the rank of each of 
the matrices $Q_{\alpha\beta}$ which constitute $Q_{\alpha\cdot}$ can be obtained. 

It is interesting that the covariance function of Theorem 
D has the form of the covariance function of an autoregressive 
moving-average process, although the stationary 
process is distinctly non-Gaussian. It may well be that techniques 
developed for order estimation in the time series 
literature can be used to estimate $M$. Labarca et al. [5] were 
primarily interested in testing whether $M$ was greater than 1, 
which is relatively simple since the large sample distribution 
of the empirical correlation coefficients in the case $M = 1$ 
can be used. 
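
A simple screening computation along these lines is sketched below in Python. It computes the empirical lag-$k$ correlation coefficient of a sequence of dwell times and compares it with a band of half-width $z/\sqrt{n}$. The band rests on our assumption that, when $M = 1$, the coefficient is approximately normal with mean zero and variance $1/n$, as for an uncorrelated sequence. This is not necessarily the procedure of [5], and the function names are ours.

```python
import numpy as np


def lag_correlation(x, k):
    """Empirical correlation coefficient at lag k of the sequence x."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.dot(d[:len(x) - k], d[k:]) / np.dot(d, d))


def exceeds_null_band(x, k=1, z=1.96):
    """True if |r_k| > z / sqrt(n). Assumes an approximate N(0, 1/n)
    null distribution for r_k when the dwell times are uncorrelated."""
    n = len(x)
    return bool(abs(lag_correlation(x, k)) > z / np.sqrt(n))


# Usage with synthetic data: independent exponential dwell times, for which
# the lag-1 coefficient should be close to zero.
rng = np.random.default_rng(0)
independent = rng.exponential(0.1, size=5000)
r1 = lag_correlation(independent, 1)
```

A coefficient well outside the band at lag 1 would argue against $M = 1$, and hence against any scheme in which the two aggregates communicate through a single state.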



4. Further Considerations 

We believe that although the results summarized in the 
previous section are useful, there are still many open and 
interesting questions. 

We have developed some necessary conditions for identi- 
fiability which can sometimes be used to conclude that hy- 
pothetical models are unidentifiable. It would be useful to 
have checkable sufficient conditions as well. 

Our analysis applies to stationary Markov processes. The 
nonstationary case is of both theoretical and practical inter- 
est, and we hope to consider that situation in the future. 
Records from the sodium channel are typically nonstation- 
ary because of the presence of an absorbing inactivated 
state. 

We have only begun to explore practical consequences of 
our results for the analysis of experimental data [3]. It is 
tempting to consider estimating the two dimensional distri- 
butions and basing further analysis on them. Horn and 
Lange [8] have proposed likelihood analysis of sample paths 
and Horn and Vandenberg [9] have applied these techniques 
to data from the sodium channel. Advantages of their ap- 
proach are that it is applicable to nonstationary data and 
multi-channel data. The likelihood method is computation- 
ally intensive, however; Horn uses an array processor on a 
VAX 11/730 and reports that days of computer time are 
necessary. An analysis based on the two dimensional distri- 
butions would be much faster; it is not clear what loss of 
statistical efficiency would be incurred. 

Finally, it would be desirable to develop more data- 
analytic and model-free methods for analyzing experimental 
records to give qualitative insights that might suggest physical 
mechanisms. Labarca et al. [3] have used box-plots to 
advantage in analyzing data from the chloride channel. 